Self-Supervised Learning for Semi-Supervised Temporal Language Grounding

Authors

Abstract

Given a text description, Temporal Language Grounding (TLG) aims to localize the temporal boundaries of the segments that contain the specified semantics in an untrimmed video. TLG is an inherently challenging task, as it requires a comprehensive understanding of both the sentence and the video content. Previous works either tackle this task in a fully-supervised setting, which requires a large amount of annotations, or in a weakly-supervised setting, which usually cannot achieve satisfactory performance. Since manual annotations are expensive, to cope with limited annotations we tackle TLG in a semi-supervised way by incorporating self-supervised learning, and propose Self-Supervised Semi-Supervised Temporal Language Grounding (S⁴TLG). S⁴TLG consists of two parts: (1) a pseudo-label generation module that adaptively produces instant labels for unlabeled samples based on predictions from a teacher model; (2) a self-supervised feature learning module with inter-modal and intra-modal contrastive losses that learn representations under the constraints of video content consistency and video-text alignment. We conduct extensive experiments on the ActivityNet-CD-OOD and Charades-CD-OOD datasets. The results demonstrate that our proposed method can achieve competitive performance compared to state-of-the-art methods while only requiring a small portion of annotations.
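To make the two components concrete, here is a minimal PyTorch-style sketch of confidence-thresholded pseudo-label generation and a symmetric InfoNCE contrastive loss. All names, shapes, and thresholds are illustrative assumptions based only on the abstract, not the authors' implementation.

```python
# Minimal sketch of the two components described in the abstract, assuming
# a PyTorch setup. Function names, shapes, and the confidence threshold are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def generate_pseudo_labels(teacher_logits, threshold=0.9):
    """Keep teacher predictions on unlabeled samples only where the teacher
    is confident; low-confidence samples are masked out of the loss."""
    probs = teacher_logits.softmax(dim=-1)      # (batch, num_segments)
    conf, labels = probs.max(dim=-1)            # per-sample confidence and label
    return labels, conf >= threshold            # labels plus selection mask

def info_nce(query, key, temperature=0.07):
    """Symmetric InfoNCE: matched pairs along the diagonal are positives,
    every other pair in the batch is a negative."""
    q = F.normalize(query, dim=-1)
    k = F.normalize(key, dim=-1)
    logits = q @ k.t() / temperature            # (batch, batch) similarity matrix
    targets = torch.arange(q.size(0), device=q.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Inter-modal term: align video segment features with sentence features;
# intra-modal term: two augmented views of the same video should agree.
video, text = torch.randn(8, 256), torch.randn(8, 256)
view_a, view_b = torch.randn(8, 256), torch.randn(8, 256)
contrastive_loss = info_nce(video, text) + info_nce(view_a, view_b)
```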


Similar Papers

Temporal Ensembling for Semi-Supervised Learning

In this paper, we present a simple and efficient method for training deep neural networks in a semi-supervised setting where only a small portion of training data is labeled. We introduce self-ensembling, where we form a consensus prediction of the unknown labels using the outputs of the network-in-training on different epochs, and most importantly, under different regularization and input augm...
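The core update is compact enough to sketch: a per-sample exponential moving average of past predictions serves as the consistency target for unlabeled data. The decay rate, bias correction, and ramp-up weighting follow the general recipe, but the variable names and default values here are assumptions.

```python
# Sketch of the temporal-ensembling consistency loss: the network's own
# predictions from earlier epochs, accumulated per sample with an
# exponential moving average, become targets for unlabeled data.
# Hyperparameter values here are illustrative assumptions.
import torch
import torch.nn.functional as F

NUM_SAMPLES, NUM_CLASSES, ALPHA = 1000, 10, 0.6
Z = torch.zeros(NUM_SAMPLES, NUM_CLASSES)   # accumulated ensemble predictions

def consistency_step(model, x, idx, epoch, unsup_weight):
    """unsup_weight is typically ramped up from zero so that the early,
    noisy ensemble targets carry little weight."""
    probs = model(x).softmax(dim=-1)
    target = Z[idx] / (1.0 - ALPHA ** (epoch + 1))   # startup bias correction
    loss = unsup_weight * F.mse_loss(probs, target.detach())
    Z[idx] = ALPHA * Z[idx] + (1 - ALPHA) * probs.detach()  # update ensemble
    return loss
```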

Semi-Supervised Learning for Natural Language Processing

The amount of unlabeled linguistic data available to us is much larger and growing much faster than the amount of labeled data. Semi-supervised learning algorithms combine unlabeled data with a small labeled training set to train better models. This tutorial emphasizes practical applications of semi-supervised learning; we treat semi-supervised learning methods as tools for building effective mo...

Semi-Supervised Learning for Natural Language

Statistical supervised learning techniques have been successful for many natural language processing tasks, but they require labeled datasets, which can be expensive to obtain. On the other hand, unlabeled data (raw text) is often available “for free” in large quantities. Unlabeled data has shown promise in improving the performance of a number of tasks, e.g. word sense disambiguation, informat...

Self-Train LogitBoost for Semi-supervised Learning

Semi-supervised classification methods use unlabeled data in combination with a smaller set of labeled examples in order to improve the classification rate over purely supervised methods, in which training uses labeled data only. In this work, a self-train LogitBoost algorithm is presented. The self-train process improves the results ...
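The self-training loop itself is generic and easy to sketch. Scikit-learn's gradient boosting stands in for LogitBoost below (an assumption; the paper's actual base learner differs), and the confidence threshold is illustrative.

```python
# Generic self-training loop of the kind the paper builds on: fit on the
# labeled set, then promote the most confident predictions on unlabeled
# data to training labels and refit. GradientBoostingClassifier stands in
# for LogitBoost (an assumption); the threshold is illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def self_train(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=5):
    X, y, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(max_rounds):
        clf = GradientBoostingClassifier().fit(X, y)
        if len(pool) == 0:
            break
        probs = clf.predict_proba(pool)
        keep = probs.max(axis=1) >= threshold
        if not keep.any():
            break                        # nothing confident left to promote
        X = np.vstack([X, pool[keep]])
        y = np.concatenate([y, clf.classes_[probs[keep].argmax(axis=1)]])
        pool = pool[~keep]
    return GradientBoostingClassifier().fit(X, y)  # final fit on enlarged set
```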

A Semi-Supervised Human Action Learning

Exploiting multimodal information such as acceleration and heart rate is a promising route to human action recognition. This paper proposes AUCC (action understanding with combinational classifier), a semi-supervised action recognition approach that uses the diversity of base classifiers to create a high-quality ensemble for multimodal human action recognition. Furthermore, both labeled ...
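AUCC's combination rule is not spelled out in this excerpt; as a rough illustration of combining diverse per-modality classifiers, a majority vote could look like the following (entirely an assumption made for illustration).

```python
# Illustrative combination step for a diversity-based multimodal ensemble:
# one base classifier per modality (e.g., acceleration, heart rate),
# combined by majority vote. This is an assumption for illustration; the
# excerpt does not specify AUCC's actual combination rule.
import numpy as np

def majority_vote(classifiers, per_modality_features):
    """classifiers[i] was trained on modality i; features are per-modality
    arrays of shape (n_samples, n_features_i). Assumes integer class labels."""
    votes = np.stack([clf.predict(X)
                      for clf, X in zip(classifiers, per_modality_features)])
    # Most frequent class label per sample across the base classifiers.
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```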


Journal

Journal: IEEE Transactions on Multimedia

Year: 2022

ISSN: 1520-9210, 1941-0077

DOI: https://doi.org/10.1109/tmm.2022.3228167